A Rule Based Morphological Analyzer and A Morphological Disambiguator for Kazakh Language
Abstract
Morphological analysis is critical for natural language processing, especially for agglutinative languages. This study describes the implementation of a rule-based morphological analyzer for Kazakh, an agglutinative language. To build the analyzer, we carry out a detailed computational analysis of Kazakh morphology, including a formalization of its alternation and morphotactic rules. In the implementation, these rules are expressed as two-level morphology rules and compiled with the Foma finite-state compiler. This is the first detailed computational analysis of Kazakh from a morphological point of view. A word can have more than one morphological parse, but only one of them is valid in a given sentence; a morphological disambiguator resolves this ambiguity by selecting one of the word's possible parses. We also present a transformation-based morphological disambiguator for Kazakh, a variation of the Brill tagger.
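As a rough illustration of transformation-based disambiguation in the style of a Brill tagger, the sketch below picks an initial parse for every word and then applies ordered, context-sensitive rules that may swap it for another parse offered by the analyzer. It is not the paper's system. The `lemma+POS+...` parse format, the tag names, the `Rule` class, the "first parse" initial guess, and the single hand-written rule are all assumptions made for this example, and the two Kazakh phrases (`қызыл алма`, "red apple", and `кітапты алма`, "don't take the book") were chosen for illustration rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical transformation: change from_tag to to_tag after prev_tag."""
    from_tag: str  # POS of the currently selected parse
    to_tag: str    # POS to switch to, if the analyzer offers such a parse
    prev_tag: str  # POS required on the previous word


def pos(parse: str) -> str:
    """Return the POS field of a parse written as 'lemma+POS+features'."""
    return parse.split("+")[1]


def disambiguate(candidates: List[List[str]], rules: List[Rule]) -> List[str]:
    """Select one parse per word from the analyzer's candidate lists."""
    # Initial guess (an assumption): take the first candidate parse of each word.
    chosen = [parses[0] for parses in candidates]
    # Apply the rules in order; each rule sees the output of the previous ones.
    for rule in rules:
        for i in range(1, len(chosen)):
            if pos(chosen[i]) == rule.from_tag and pos(chosen[i - 1]) == rule.prev_tag:
                # Only switch to a parse the analyzer actually produced.
                for parse in candidates[i]:
                    if pos(parse) == rule.to_tag:
                        chosen[i] = parse
                        break
    return chosen


if __name__ == "__main__":
    # 'алма' is ambiguous between a noun and a negated imperative of 'ал'.
    rules = [Rule(from_tag="Verb", to_tag="Noun", prev_tag="Adj")]
    red_apple = [["қызыл+Adj"], ["ал+Verb+Neg+Imp", "алма+Noun+Nom"]]
    take_book = [["кітап+Noun+Acc"], ["ал+Verb+Neg+Imp", "алма+Noun+Nom"]]
    print(disambiguate(red_apple, rules))
    print(disambiguate(take_book, rules))
```

In a real transformation-based system the rule list would be learned from an annotated corpus rather than written by hand; restricting each change to the analyzer's own candidate parses is what separates disambiguation from free tagging.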
Related resources
Rule Based Morphological Analyzer of Kazakh Language
Having a morphological analyzer is a very critical issue especially for NLP related tasks on agglutinative languages. This paper presents a detailed computational analysis of Kazakh language which is an agglutinative language. With a detailed analysis of Kazakh language morphology, the formalization of rules over all morphotactics of Kazakh language is worked out and a rule-based morphological ...
A Rule-Based Morphological Disambiguator for Turkish
Part-of-speech (POS) tagging is the process of assigning each word of an input text to an appropriate morphological class. Automatic recognition of parts of speech is very important for high-level NLP applications, since it would usually be infeasible to perform this task manually in practical systems. One approach to POS tagging uses morphological disambiguation which selects the most suitab...
Finite State Approach to the Kazakh Nominal Paradigm
This work presents a finite-state approach to the Kazakh nominal paradigm. A finite-state transducer for the nominal paradigm of Kazakh, an agglutinative language, was developed and implemented. The morphophonemic constraints that are imposed by the Kazakh language synharmonism (vowel and consonant harmony) on the combinations of letters under af...
Pars Morph: A Morphological Analyzer for Persian
This paper introduces the theoretical foundation, implementation, and uses of Pars Morph, a Persian morphological analyzer. Pars Morph is a rule-based Persian morphological analysis system that analyzes the internal structure of Persian words and determines the grammatical category and function of the word parts. Pars Morph being in link with a lexicon covering about 45...
Formal models of nouns in the Kazakh language
This paper explains how semantic hypergraphs are used to construct ontological models of morphological rules in the Kazakh language. The nodes within these graphs represent semantic features (morphological concepts) and the edges within represent the relationships between these features. Word forms within the hypergraph structure are described in trees which are converted into linear parenthesi...
Publication date: 2015